The Ising model (or Lenz–Ising model), named after the physicists Ernst Ising and Wilhelm Lenz, is a mathematical model of ferromagnetism in statistical mechanics. The model consists of discrete variables that represent magnetic dipole moments of atomic "spins" that can be in one of two states (+1 or −1). The spins are arranged in a graph, usually a lattice (where the local structure repeats periodically in all directions), allowing each spin to interact with its neighbors. Neighboring spins that agree have a lower energy than those that disagree; the system tends to the lowest energy, but heat disturbs this tendency, thus creating the possibility of different structural phases. The two-dimensional square-lattice Ising model is one of the simplest statistical models to show a phase transition. Though it is a highly simplified model of a magnetic material, the Ising model can still provide qualitative and sometimes quantitative results applicable to real physical systems.
The Ising model was invented by the physicist Wilhelm Lenz, who gave it as a problem to his student Ernst Ising. The one-dimensional Ising model was solved by Ising alone in his 1924 thesis (Ernst Ising, Contribution to the Theory of Ferromagnetism); it has no phase transition. The two-dimensional square-lattice Ising model is much harder and was only given an analytic description much later, by Lars Onsager. It is usually solved by a transfer-matrix method, although there exists a very simple approach relating the model to a non-interacting fermionic quantum field theory.
In dimensions greater than four, the phase transition of the Ising model is described by mean-field theory. The Ising model for greater dimensions was also explored with respect to various tree topologies in the late 1970s, culminating in an exact solution of the zero-field, time-independent model for closed Cayley trees of arbitrary branching ratio, and thereby, arbitrarily large dimensionality within tree branches. The solution to this model exhibited a new, unusual phase transition behavior, along with non-vanishing long-range and nearest-neighbor spin-spin correlations, deemed relevant to large neural networks as one of its possible applications.
The Ising problem without an external field can be equivalently formulated as a graph maximum cut (Max-Cut) problem that can be solved via combinatorial optimization.
Let $\Lambda$ denote the set of lattice sites and $\sigma_k \in \{-1, +1\}$ the spin at site $k$. For any two adjacent sites $i, j \in \Lambda$ there is an interaction $J_{ij}$. Also a site $j \in \Lambda$ has an external magnetic field $h_j$ interacting with it. The energy of a configuration $\sigma$ is given by the Hamiltonian function
$$H(\sigma) = -\sum_{\langle i\,j\rangle} J_{ij}\,\sigma_i \sigma_j - \mu \sum_j h_j \sigma_j,$$
where the first sum is over pairs of adjacent spins (every pair is counted once). The notation $\langle i\,j\rangle$ indicates that sites $i$ and $j$ are nearest neighbors. The magnetic moment is given by $\mu$. Note that the sign in the second term of the Hamiltonian above should actually be positive because the electron's magnetic moment is antiparallel to its spin, but the negative term is used conventionally. The Ising Hamiltonian is an example of a pseudo-Boolean function; tools from the analysis of Boolean functions can be applied to describe and study it.
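The configuration probability is given by the Boltzmann distribution with inverse temperature $\beta \ge 0$:
$$P_\beta(\sigma) = \frac{e^{-\beta H(\sigma)}}{Z_\beta},$$
where $\beta = 1/(k_{\text{B}} T)$, and the normalization constant
$$Z_\beta = \sum_\sigma e^{-\beta H(\sigma)}$$
is the partition function. For a function $f$ of the spins ("observable"), one denotes by
$$\langle f \rangle_\beta = \sum_\sigma f(\sigma)\, P_\beta(\sigma)$$
the expectation (mean) value of $f$.
The configuration probabilities $P_\beta(\sigma)$ represent the probability that (in equilibrium) the system is in a state with configuration $\sigma$.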
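For a handful of spins, the Boltzmann distribution, the partition function and an expectation value can be evaluated by brute-force enumeration. The following is a minimal sketch, assuming a uniform coupling J and a uniform field h; the function names and the 2×2 example lattice are illustrative choices, not part of the model's definition.

```python
import itertools
import math


def energy(spins, edges, J=1.0, h=0.0, mu=1.0):
    """H(sigma) = -J * sum_<ij> s_i s_j - mu * h * sum_j s_j (uniform J, h assumed)."""
    pair_term = sum(spins[i] * spins[j] for i, j in edges)
    field_term = sum(spins)
    return -J * pair_term - mu * h * field_term


def boltzmann(edges, n_sites, beta, J=1.0, h=0.0):
    """Return (Z, probs), where probs maps each configuration to P_beta(sigma)."""
    configs = list(itertools.product((-1, 1), repeat=n_sites))
    weights = {s: math.exp(-beta * energy(s, edges, J, h)) for s in configs}
    Z = sum(weights.values())
    return Z, {s: w / Z for s, w in weights.items()}


def expectation(f, probs):
    """<f>_beta = sum over sigma of f(sigma) * P_beta(sigma)."""
    return sum(f(s) * p for s, p in probs.items())


# Four sites on a 2x2 square (sites 0 1 / 2 3); each nearest-neighbor pair once.
edges = [(0, 1), (2, 3), (0, 2), (1, 3)]
Z, probs = boltzmann(edges, n_sites=4, beta=0.5)
mean_magnetization = expectation(lambda s: sum(s) / len(s), probs)
print(Z, mean_magnetization)
```

Enumeration costs 2^n terms, so it is only practical for very small systems; it is mainly useful as a check on sampling methods.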
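Under the minus-sign convention of the Hamiltonian, an interaction is ferromagnetic if $J_{ij} > 0$ and antiferromagnetic if $J_{ij} < 0$; if $J_{ij} = 0$, the spins do not interact. The system is called ferromagnetic or antiferromagnetic if all interactions are ferromagnetic or all are antiferromagnetic. The original Ising models were ferromagnetic, and it is still often assumed that "Ising model" means a ferromagnetic Ising model.
In a ferromagnetic Ising model, spins tend to align: the configurations in which adjacent spins are of the same sign have higher probability. In an antiferromagnetic model, adjacent spins tend to have opposite signs.
The sign convention of $H(\sigma)$ also explains how a spin site $j$ interacts with the external field. Namely, the spin site tends to line up with the external field: if $h_j > 0$, the spin site $j$ tends to line up in the positive direction; if $h_j < 0$, it tends to line up in the negative direction; and if $h_j = 0$, there is no external influence on the spin site.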
When the external field is zero everywhere, h = 0, the Ising model is symmetric under switching the value of the spin in all the lattice sites; a nonzero field breaks this symmetry.
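Another common simplification is to assume that all of the nearest neighbors $\langle i\,j\rangle$ have the same interaction strength. Then we can set $J_{ij} = J$ for all pairs $i, j$ in $\Lambda$. In this case, with no external field, the Hamiltonian is further simplified to
$$H(\sigma) = -J \sum_{\langle i\,j\rangle} \sigma_i \sigma_j.$$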
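For the Ising model without an external field on a graph $G$, the Hamiltonian becomes the following sum over the graph edges $E(G)$:
$$H(\sigma) = -\sum_{ij \in E(G)} J_{ij}\,\sigma_i \sigma_j.$$
Here each vertex $i$ of the graph is a spin site that takes a spin value $\sigma_i = \pm 1$. A given spin configuration $\sigma$ partitions the set of vertices $V(G)$ into two $\sigma$-dependent subsets, those with spin up $V^+$ and those with spin down $V^-$. We denote by $\delta(V^+)$ the $\sigma$-dependent set of edges that connects the two complementary vertex subsets $V^+$ and $V^-$. The size $|\delta(V^+)|$ of the cut $\delta(V^+)$ that bipartitions the weighted undirected graph $G$ can be defined as
$$|\delta(V^+)| = \frac{1}{2} \sum_{ij \in \delta(V^+)} W_{ij},$$
where $W_{ij}$ denotes the weight of the edge $ij$ and the scaling 1/2 is introduced to compensate for double counting the same weights $W_{ij} = W_{ji}$.
The identities
$$H(\sigma) = -\sum_{ij \in E(V^+)} J_{ij} - \sum_{ij \in E(V^-)} J_{ij} + \sum_{ij \in \delta(V^+)} J_{ij} = -\sum_{ij \in E(G)} J_{ij} + 2 \sum_{ij \in \delta(V^+)} J_{ij},$$
where the total sum in the first term does not depend on $\sigma$, imply that minimizing $H(\sigma)$ in $\sigma$ is equivalent to minimizing $\sum_{ij \in \delta(V^+)} J_{ij}$. Defining the edge weight $W_{ij} = -J_{ij}$ thus turns the Ising problem without an external field into a graph Max-Cut problem maximizing the cut size $|\delta(V^+)|$, which is related to the Ising Hamiltonian as follows:
$$H(\sigma) = \sum_{ij \in E(G)} W_{ij} - 4\,|\delta(V^+)|.$$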
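The correspondence between the zero-field Ising energy and the cut size can be checked numerically on a small graph. The sketch below assumes one particular reading of the double-counting remark: every edge sum runs over ordered pairs (i, j) and (j, i) of a symmetric coupling matrix, so that the factor 1/2 in the cut size removes the double counting. The matrix `J` is an arbitrary illustrative example.

```python
import itertools


def hamiltonian(J, sigma):
    """H(sigma) = -sum over ordered pairs (i, j), i != j, of J[i][j] * s_i * s_j."""
    n = len(sigma)
    return -sum(J[i][j] * sigma[i] * sigma[j]
                for i in range(n) for j in range(n) if i != j)


def cut_size(W, sigma):
    """Half the sum of W[i][j] over ordered pairs whose spins differ."""
    n = len(sigma)
    return 0.5 * sum(W[i][j]
                     for i in range(n) for j in range(n)
                     if sigma[i] != sigma[j])


# Symmetric couplings with zero diagonal (assumed example values).
J = [[0, 1, -2, 0],
     [1, 0, 3, -1],
     [-2, 3, 0, 2],
     [0, -1, 2, 0]]
W = [[-x for x in row] for row in J]          # edge weights W_ij = -J_ij
total_weight = sum(sum(row) for row in W)

configs = list(itertools.product((-1, 1), repeat=len(J)))
# Largest deviation from H(sigma) = sum(W) - 4 * cut over all configurations.
max_gap = max(abs(hamiltonian(J, s) - (total_weight - 4 * cut_size(W, s)))
              for s in configs)
ground_state = min(configs, key=lambda s: hamiltonian(J, s))
best_cut = max(cut_size(W, s) for s in configs)
print(max_gap, ground_state, cut_size(W, ground_state), best_cut)
```

Because the energy and the cut size differ only by a constant and a negative factor, the minimum-energy configuration found by enumeration is also a maximum cut.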
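In one dimension, for any positive β the correlations $\langle \sigma_i \sigma_j \rangle_\beta$ decay exponentially in $|i - j|$, and the system is disordered. On the basis of this result, Ising incorrectly concluded that this model does not exhibit phase behaviour in any dimension.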
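That the Ising model undergoes a phase transition between an ordered and a disordered phase in two or more dimensions was first proven by Rudolf Peierls in 1936, using what is now called a Peierls argument.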
The Ising model on a two-dimensional square lattice with no magnetic field was analytically solved by Lars Onsager. Onsager obtained the correlation functions and free energy of the Ising model and announced the formula for the spontaneous magnetization for the two-dimensional model in 1949 but did not give a derivation. The first published proof of this formula used a limit formula for Fredholm determinants, proved in 1951 by Szegő in direct response to Onsager's work.
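For a ferromagnetic Ising model, the Griffiths inequality states that, given any subsets of spins $\sigma_A$ and $\sigma_B$ on the lattice,
$$\langle \sigma_A \sigma_B \rangle \ge \langle \sigma_A \rangle \langle \sigma_B \rangle,$$
where $\langle \sigma_A \rangle = \langle \prod_{j \in A} \sigma_j \rangle$.
With $B = \emptyset$, the special case $\langle \sigma_A \rangle \ge 0$ results.
This means that spins are positively correlated on the Ising ferromagnet. An immediate application of this is that the magnetization of any set of spins $\langle \sigma_A \rangle$ is increasing with respect to any set of coupling constants $J_B$.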
A further correlation bound, the Simon–Lieb inequality, can be used to establish the sharpness of phase transition for the Ising model.
Once modern quantum mechanics was formulated, atomism was no longer in conflict with experiment, but this did not lead to a universal acceptance of statistical mechanics, which went beyond atomism. Josiah Willard Gibbs had given a complete formalism to reproduce the laws of thermodynamics from the laws of mechanics. But many faulty arguments survived from the 19th century, when statistical mechanics was considered dubious. The lapses in intuition mostly stemmed from the fact that the limit of an infinite statistical system has many zero-one laws which are absent in finite systems: an infinitesimal change in a parameter can lead to big differences in the overall, aggregate behavior.
The argument that the partition function, being a sum of analytic exponentials $e^{-\beta E}$, must itself be analytic in β works for a finite sum of exponentials, and correctly establishes that there are no singularities in the free energy of a system of a finite size. For systems which are in the thermodynamic limit (that is, for infinite systems) the infinite sum can lead to singularities. The convergence to the thermodynamic limit is fast, so that the phase behavior is apparent already on a relatively small lattice, even though the singularities are smoothed out by the system's finite size.
This was first established by Rudolf Peierls in the Ising model.
To do this, he compared the high-temperature and low-temperature limits. At infinite temperature (β = 0) all configurations have equal probability. Each spin is completely independent of any other, and if typical configurations at infinite temperature are plotted so that plus/minus are represented by black and white, they look like television snow. For high, but not infinite, temperature, there are small correlations between neighboring positions and the snow tends to clump a little bit, but the screen stays random-looking, and there is no net excess of black or white.
A quantitative measure of the excess is the magnetization, which is the average value of the spin:
$$M = \frac{1}{N} \sum_{i=1}^{N} \sigma_i.$$
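A bogus argument analogous to the argument in the last section now establishes that the average magnetization in the Ising model is always zero: in zero field, every configuration has the same energy as the configuration with all spins flipped, so configurations with magnetization $M$ and $-M$ are equally probable and the average vanishes.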
As before, this only proves that the average magnetization is zero at any finite volume. For an infinite system, fluctuations might not be able to push the system from a mostly plus state to a mostly minus with a nonzero probability.
For very high temperatures, the magnetization is zero, as it is at infinite temperature. To see this, note that if spin A has only a small correlation ε with spin B, and B is only weakly correlated with C, but C is otherwise independent of A, the amount of correlation of A and C goes like $\varepsilon^2$. For two spins separated by distance $L$, the amount of correlation goes as $\varepsilon^L$, but if there is more than one path by which the correlations can travel, this amount is enhanced by the number of paths.
The number of paths of length $L$ on a square lattice in $d$ dimensions is
$$N(L) = (2d)^L,$$
since there are $2d$ choices for where to go at each step.
A bound on the total correlation is given by the contribution to the correlation by summing over all paths linking two points, which is bounded above by the sum over all paths of length $L$,
$$\sum_L (2d)^L \varepsilon^L,$$
which goes to zero when ε is small.
At low temperatures (β ≫ 1) the configurations are near the lowest-energy configuration, the one where all the spins are plus or all the spins are minus. Peierls asked whether it is statistically possible at low temperature, starting with all the spins minus, to fluctuate to a state where most of the spins are plus. For this to happen, droplets of plus spin must be able to congeal to make the plus state.
The energy of a droplet of plus spins in a minus background is proportional to the perimeter of the droplet $L$, where plus spins and minus spins neighbor each other. For a droplet with perimeter $L$, the area is somewhere between $(L-2)/2$ (the straight line) and $(L/4)^2$ (the square box). The probability cost for introducing a droplet has the factor $e^{-\beta L}$, but this contributes to the partition function multiplied by the total number of droplets with perimeter $L$, which is less than the total number of paths of length $L$:
$$N(L) < 4^{2L}.$$
So that the total spin contribution from droplets, even overcounting by allowing each site to have a separate droplet, is bounded above by
$$\sum_L L^2\, 4^{2L} e^{-\beta L},$$
which goes to zero at large β. For β sufficiently large, this exponentially suppresses long loops, so that they cannot occur, and the magnetization never fluctuates too far from −1.
So Peierls established that the magnetization in the Ising model eventually defines superselection sectors, separated domains not linked by finite fluctuations.
In the 19th century, it was thought that magnetic fields are due to currents in matter, and Ampère postulated that permanent magnets are caused by permanent atomic currents. The motion of classical charged particles could not explain permanent currents though, as shown by Joseph Larmor. In order to have ferromagnetism, the atoms must have permanent magnetic moments which are not due to the motion of classical charges.
Once the electron's spin was discovered, it was clear that the magnetism should be due to a large number of electron spins all oriented in the same direction. It was natural to ask how the electrons' spins all know which direction to point in, because the electrons on one side of a magnet don't directly interact with the electrons on the other side. They can only influence their neighbors. The Ising model was designed to investigate whether a large fraction of the electron spins could be oriented in the same direction using only local forces.
A coarse model is to make space-time a lattice and imagine that each position either contains an atom or it doesn't. The space of configurations is that of independent bits $B_i$, where each bit is either 0 or 1 depending on whether the position is occupied or not. An attractive interaction reduces the energy of two nearby atoms. If the attraction is only between nearest neighbors, the energy is reduced by $4JB_iB_j$ for each occupied neighboring pair.
The density of the atoms can be controlled by adding a chemical potential, which is a multiplicative probability cost for adding one more atom. A multiplicative factor in probability can be reinterpreted as an additive term in the logarithm – the energy. The extra energy of a configuration with N atoms is changed by μN. The probability cost of one more atom is a factor of exp(− βμ).
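So the energy of the lattice gas is:
$$E = -\frac{1}{2} \sum_{\langle i,j\rangle} 4J\,B_i B_j + \sum_i \mu B_i.$$
Rewriting the bits in terms of spins, $B_i = (\sigma_i + 1)/2$, gives, up to an additive constant,
$$E = -\frac{1}{2} \sum_{\langle i,j\rangle} J\,\sigma_i \sigma_j - \frac{1}{2} \sum_i (zJ - \mu)\,\sigma_i.$$
For lattices where every site has an equal number of neighbors, this is the Ising model with a magnetic field $h = (zJ - \mu)/2$, where $z$ is the number of neighbors.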
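The change of variables from occupation bits to spins can be verified by enumeration on a small lattice. This is a sketch under stated assumptions: the lattice is a periodic ring (so z = 2), each neighboring pair is counted once in the pair sums while the prefactor 1/2 is kept, and the additive constant is worked out explicitly for that convention. The function names and parameter values are illustrative.

```python
import itertools


def gas_energy(bits, pairs, J, mu):
    """Lattice-gas energy: -(1/2) * sum_pairs 4 J B_i B_j + mu * sum_i B_i."""
    return -0.5 * sum(4 * J * bits[i] * bits[j] for i, j in pairs) + mu * sum(bits)


def spin_energy(spins, pairs, J, mu, z):
    """Ising form with field h = (z J - mu) / 2 plus the configuration-independent constant."""
    h = (z * J - mu) / 2
    constant = -0.5 * J * len(pairs) + 0.5 * mu * len(spins)
    pair_term = -0.5 * J * sum(spins[i] * spins[j] for i, j in pairs)
    return pair_term - h * sum(spins) + constant


n, J, mu = 6, 1.0, 0.3
pairs = [(i, (i + 1) % n) for i in range(n)]   # periodic ring, z = 2 neighbors

max_gap = 0.0
for bits in itertools.product((0, 1), repeat=n):
    spins = tuple(2 * b - 1 for b in bits)      # sigma_i = 2 B_i - 1
    gap = abs(gas_energy(bits, pairs, J, mu) - spin_energy(spins, pairs, J, mu, z=2))
    max_gap = max(max_gap, gap)
print(max_gap)
```

The two energies agree for every configuration, so the lattice gas and the Ising model in a field assign identical Boltzmann weights up to an overall factor.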
In biological systems, modified versions of the lattice gas model have been used to understand a range of binding behaviors. These include the binding of ligands to receptors in the cell surface, the binding of chemotaxis proteins to the flagellar motor, and the condensation of DNA.
Following the general approach of Jaynes, a later interpretation of Schneidman, Berry, Segev and Bialek is that the Ising model is useful for any model of neural function, because a statistical model for neural activity should be chosen using the principle of maximum entropy. Given a collection of neurons, a statistical model which can reproduce the average firing rate for each neuron introduces a Lagrange multiplier for each neuron:
$$E = -\sum_i h_i S_i.$$
But the activity of each neuron in this model is statistically independent. To allow for pair correlations, when one neuron tends to fire (or not to fire) along with another, introduce pairwise Lagrange multipliers:
$$E = -\frac{1}{2} \sum_{ij} J_{ij} S_i S_j - \sum_i h_i S_i,$$
where $J_{ij}$ are not restricted to neighbors. Note that this generalization of the Ising model is sometimes called the quadratic exponential binary distribution in statistics. This energy function only introduces probability biases for a spin having a value and for a pair of spins having the same value. Higher-order correlations are unconstrained by the multipliers. An activity pattern sampled from this distribution requires the largest number of bits to store in a computer, in the most efficient coding scheme imaginable, as compared with any other distribution with the same average activity and pairwise correlations. This means that Ising models are relevant to any system which is described by bits which are as random as possible, with constraints on the pairwise correlations and the average number of 1s, which frequently occurs in both the physical and social sciences.
The Sherrington–Kirkpatrick (SK) model of spin glass, published in 1975, is the Hopfield network with random initialization. Sherrington and Kirkpatrick found that it is highly likely for the energy function of the SK model to have many local minima. In his 1982 paper, Hopfield applied this recently developed theory to study the Hopfield network with binary activation functions. In a 1984 paper he extended this to continuous activation functions. It became a standard model for the study of neural networks through statistical mechanics.
In this solution the branching ratio is arbitrary (greater than or equal to 2), the coupling is set by the nearest-neighbor interaction energy, and there are k (→ ∞ in the thermodynamic limit) generations in each of the tree branches (forming the closed tree architecture as shown in the given closed Cayley tree diagram). The sum in the last term of the free energy can be shown to converge uniformly and rapidly (i.e. for z → ∞, it remains finite), yielding a continuous and monotonic function, which establishes that, for a branching ratio greater than or equal to 2, the free energy is a continuous function of temperature T. Further analysis of the free energy indicates that it exhibits an unusual discontinuous first derivative at the critical temperature.
The spin-spin correlation between sites (in general, m and n) on the tree was found to have a transition point when considered at the vertices (e.g. A and Ā, its reflection), their respective neighboring sites (such as B and its reflection), and between sites adjacent to the top and bottom extreme vertices of the two trees (e.g. A and B). It may be determined from an expression involving the number of bonds, the number of graphs counted for odd vertices with even intermediate sites (see cited methodologies and references for detailed calculations), the multiplicity resulting from two-valued spin possibilities, and the partition function. (Note: the spin variable used in the referenced literature for this section is equivalent to the one utilized above and in earlier sections; it is valued at ±1.)
The critical temperature for this model is determined only by the branching ratio and the site-to-site interaction energy, a fact which may have direct implications associated with neural structure vs. its function (in that it relates the energies of interaction and branching ratio to its transitional behavior). For example, a relationship between the transition behavior of activities of neural networks between sleeping and wakeful states (which may correlate with a spin-spin type of phase transition) in terms of changes in neural interconnectivity (the branching ratio) and/or neighbor-to-neighbor interactions (the interaction energy), over time, is just one possible avenue suggested for further experimental investigation into such a phenomenon. In any case, for this Ising model it was established that “the stability of the long-range correlation increases with increasing [branching ratio] or increasing [interaction energy].”
For this topology, the spin-spin correlation was found to be zero between the extreme vertices and the central sites at which the two trees (or branches) are joined (i.e. between A and individually C, D, or E). This behavior is explained to be due to the fact that, as k increases, the number of links increases exponentially (between the extreme vertices) and so even though the contribution to spin correlations decreases exponentially, the correlation between sites such as the extreme vertex (A) in one tree and the extreme vertex in the joined tree (Ā) remains finite (above the critical temperature). In addition, A and B also exhibit a non-vanishing correlation (as do their reflections), so that B-level sites (together with A-level sites) can be considered “clusters” which tend to exhibit synchronization of firing.
Based upon a review of other classical network models as a comparison, the Ising model on a closed Cayley tree was determined to be the first classical statistical mechanical model to demonstrate both local and long-range sites with non-vanishing spin-spin correlations, while at the same time exhibiting intermediate sites with zero correlation, which indeed was a relevant matter for large neural networks at the time of its consideration. The model's behavior is also of relevance for any other divergent-convergent tree physical (or biological) system exhibiting a closed Cayley tree topology with an Ising-type of interaction. This topology should not be ignored since its behavior for Ising models has been solved exactly, and presumably nature will have found a way of taking advantage of such simple symmetries at many levels of its designs.
The possibility of interrelationships between (1) the classical large neural network model (with similar coupled divergent-convergent topologies) and (2) an underlying statistical quantum mechanical model (independent of topology and with persistence in fundamental quantum states) was noted early on.
It was a natural and common belief among early neurophysicists (e.g. Umezawa, Krizan, Barth, etc.) that classical neural models (including those with statistical mechanical aspects) will one day have to be integrated with quantum physics (with quantum statistical aspects), similar perhaps to how the domain of chemistry has historically integrated itself into quantum physics via quantum chemistry.
Several additional statistical mechanical problems of interest remain to be solved for the closed Cayley tree, including the time-dependent case and the external field situation, as well as theoretical efforts aimed at understanding interrelationships with underlying quantum constituents and their physics.
Since every spin site has ±1 spin, there are $2^L$ different states that are possible.
The Hamiltonian that is commonly used to represent the energy of the model when using Monte Carlo methods is:
$$H(\sigma) = -J \sum_{\langle i\,j\rangle} \sigma_i \sigma_j - h \sum_j \sigma_j.$$
Furthermore, the Hamiltonian is further simplified by assuming zero external field $h$, since many questions that are posed to be solved using the model can be answered in the absence of an external field. This leads us to the following energy equation for state $\sigma$:
$$H(\sigma) = -J \sum_{\langle i\,j\rangle} \sigma_i \sigma_j.$$
Given this Hamiltonian, quantities of interest such as the specific heat or the magnetization of the magnet at a given temperature can be calculated.
When implementing the algorithm, one must ensure that the selection probability $g(\mu, \nu)$ is chosen such that ergodicity is met. In thermal equilibrium a system's energy only fluctuates within a small range. This is the motivation behind the concept of single-spin-flip dynamics, which states that in each transition, we will only change one of the spin sites on the lattice. Furthermore, by using single-spin-flip dynamics, one can get from any state to any other state by flipping each site that differs between the two states one at a time.

The maximum amount of change between the energy of the present state, $H_\mu$, and any possible new state's energy $H_\nu$ (using single-spin-flip dynamics) is $2J$ between the spin we choose to "flip" to move to the new state and that spin's neighbor. Thus, in a 1D Ising model, where each site has two neighbors (left and right), the maximum difference in energy would be $4J$. Let $c$ represent the lattice coordination number, that is, the number of nearest neighbors that any lattice site has. We assume that all sites have the same number of neighbors due to periodic boundary conditions.

It is important to note that the Metropolis–Hastings algorithm does not perform well around the critical point due to critical slowing down. Other techniques such as multigrid methods, Niedermayer's algorithm, the Swendsen–Wang algorithm, or the Wolff algorithm are required in order to resolve the model near the critical point, which is a requirement for determining the critical exponents of the system.
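Specifically for the Ising model and using single-spin-flip dynamics, one can establish the following. Since there are $L$ total sites on the lattice, using single-spin-flip as the only way we transition to another state, we can see that there are a total of $L$ new states $\nu$ from our present state $\mu$. The algorithm assumes that the selection probabilities are equal for the $L$ states: $g(\mu, \nu) = 1/L$. Writing $P(\mu, \nu)$ for the transition probability and $A(\mu, \nu)$ for the acceptance probability, detailed balance tells us that the following equation must hold:
$$\frac{P(\mu,\nu)}{P(\nu,\mu)} = \frac{g(\mu,\nu)\,A(\mu,\nu)}{g(\nu,\mu)\,A(\nu,\mu)} = \frac{A(\mu,\nu)}{A(\nu,\mu)} = \frac{P_\beta(\nu)}{P_\beta(\mu)} = \frac{\frac{1}{Z}e^{-\beta H_\nu}}{\frac{1}{Z}e^{-\beta H_\mu}} = e^{-\beta(H_\nu - H_\mu)}.$$
Thus, we want to select the acceptance probability for our algorithm to satisfy
$$\frac{A(\mu,\nu)}{A(\nu,\mu)} = e^{-\beta(H_\nu - H_\mu)}.$$
If $H_\nu > H_\mu$, then $A(\nu, \mu) > A(\mu, \nu)$. Metropolis sets the larger of $A(\mu, \nu)$ or $A(\nu, \mu)$ to be 1. By this reasoning the acceptance algorithm is:
$$A(\mu,\nu) = \begin{cases} e^{-\beta(H_\nu - H_\mu)}, & \text{if } H_\nu - H_\mu > 0, \\ 1, & \text{otherwise}. \end{cases}$$
The basic form of the algorithm is as follows:
1. Pick a spin site using the selection probability $g(\mu, \nu)$ and calculate the contribution to the energy involving this spin.
2. Flip the value of the spin and calculate the new contribution.
3. If the new energy is less, keep the flipped value.
4. If the new energy is more, only keep the flipped value with probability $e^{-\beta(H_\nu - H_\mu)}$.
5. Repeat.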
The change in energy Hν − Hμ only depends on the value of the spin and its nearest graph neighbors. So if the graph is not too connected, the algorithm is fast. This process will eventually produce a pick from the distribution.
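The single-spin-flip procedure described above can be sketched in a few lines. The following is a minimal illustration, assuming a square n × n lattice with periodic boundaries, zero external field and uniform coupling J; the function names, lattice size, sweep count and seed are arbitrary choices made for the example.

```python
import math
import random


def metropolis_sweep(spins, beta, J=1.0, rng=random):
    """One sweep = n*n attempted single-spin flips on an n x n periodic lattice."""
    n = len(spins)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)   # uniform site selection
        s = spins[i][j]
        neighbors = (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
                     + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])
        delta_e = 2.0 * J * s * neighbors           # H_nu - H_mu for flipping this spin
        # Accept with probability 1 if the energy does not rise, else exp(-beta * delta_e).
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[i][j] = -s


def magnetization(spins):
    """Average spin value of the lattice."""
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)


def simulate(n=16, beta=0.6, sweeps=100, seed=0):
    """Start from the all-plus state and apply the given number of sweeps."""
    rng = random.Random(seed)
    spins = [[1] * n for _ in range(n)]
    for _ in range(sweeps):
        metropolis_sweep(spins, beta, rng=rng)
    return spins


cold = simulate(beta=1.0)   # well below the critical temperature: stays ordered
hot = simulate(beta=0.1)    # well above it: the magnetization relaxes toward zero
print(magnetization(cold), magnetization(hot))
```

Only the chosen spin and its four neighbors enter the energy difference, which is why each attempted flip costs constant time regardless of lattice size.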
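If $h = 0$, it is very easy to obtain the free energy in the case of free boundary condition, i.e. when
$$H(\sigma) = -J\left(\sigma_1\sigma_2 + \cdots + \sigma_{L-1}\sigma_L\right).$$
Then the model factorizes under the change of variables
$$\sigma'_j = \sigma_j \sigma_{j-1}, \quad j \ge 2.$$
This gives
$$Z(\beta) = \sum_{\sigma_1,\ldots,\sigma_L} e^{\beta J \sigma_1\sigma_2} e^{\beta J \sigma_2\sigma_3} \cdots e^{\beta J \sigma_{L-1}\sigma_L} = 2 \prod_{j=2}^{L} \sum_{\sigma'_j} e^{\beta J \sigma'_j} = 2\left[e^{\beta J} + e^{-\beta J}\right]^{L-1}.$$
Therefore, the free energy is
$$f(\beta, 0) = -\lim_{L\to\infty} \frac{1}{\beta L} \ln Z(\beta) = -\frac{1}{\beta} \ln\left(e^{\beta J} + e^{-\beta J}\right).$$
With the same change of variables
$$\langle \sigma_j \sigma_{j+N} \rangle = \left[\frac{e^{\beta J} - e^{-\beta J}}{e^{\beta J} + e^{-\beta J}}\right]^N,$$
hence it decays exponentially as soon as $T \ne 0$; but for $T = 0$, i.e. in the limit $\beta \to \infty$, there is no decay.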
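The closed forms for the open chain can be compared against direct enumeration for a short chain. In this sketch the standard zero-field results are assumed, namely Z = 2(e^{βJ} + e^{−βJ})^{L−1} and a two-point correlation equal to tanh(βJ)^N; the chain length and parameter values are illustrative.

```python
import itertools
import math


def chain_energy(spins, J=1.0):
    """Open-chain energy: -J * sum of nearest-neighbor products, no field."""
    return -J * sum(spins[k] * spins[k + 1] for k in range(len(spins) - 1))


def chain_partition(L, beta, J=1.0):
    """Partition function by summing Boltzmann weights over all 2**L configurations."""
    return sum(math.exp(-beta * chain_energy(s, J))
               for s in itertools.product((-1, 1), repeat=L))


def chain_correlation(L, beta, j, N, J=1.0):
    """<sigma_j sigma_{j+N}> by enumeration (sites are 0-indexed)."""
    Z = chain_partition(L, beta, J)
    total = sum(s[j] * s[j + N] * math.exp(-beta * chain_energy(s, J))
                for s in itertools.product((-1, 1), repeat=L))
    return total / Z


L, beta, J = 8, 0.7, 1.0
Z_brute = chain_partition(L, beta, J)
Z_closed = 2 * (math.exp(beta * J) + math.exp(-beta * J)) ** (L - 1)
corr_brute = chain_correlation(L, beta, j=1, N=3, J=J)
corr_closed = math.tanh(beta * J) ** 3
print(Z_brute, Z_closed, corr_brute, corr_closed)
```

Since tanh(βJ) is strictly less than 1 at any finite β, the correlation shrinks by a fixed factor per lattice step, which is the exponential decay stated above.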
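If $h \ne 0$ we need the transfer matrix method. For the case of periodic boundary conditions, the partition function is
$$Z(\beta) = \sum_{\sigma_1,\ldots,\sigma_L} e^{\beta h \sigma_1} e^{\beta J \sigma_1\sigma_2} e^{\beta h \sigma_2} e^{\beta J \sigma_2\sigma_3} \cdots e^{\beta h \sigma_L} e^{\beta J \sigma_L\sigma_1} = \sum_{\sigma_1,\ldots,\sigma_L} V_{\sigma_1,\sigma_2} V_{\sigma_2,\sigma_3} \cdots V_{\sigma_L,\sigma_1}.$$
The coefficients $V_{\sigma,\sigma'}$ can be seen as the entries of a matrix. There are different possible choices: a convenient one (because the matrix is symmetric) is
$$V_{\sigma,\sigma'} = e^{\frac{\beta h}{2}\sigma}\, e^{\beta J \sigma\sigma'}\, e^{\frac{\beta h}{2}\sigma'},$$
or
$$V = \begin{bmatrix} e^{\beta(h+J)} & e^{-\beta J} \\ e^{-\beta J} & e^{-\beta(h-J)} \end{bmatrix}.$$
In matrix formalism
$$Z(\beta) = \operatorname{Tr}\left(V^L\right) = \lambda_1^L + \lambda_2^L = \lambda_1^L \left[1 + \left(\frac{\lambda_2}{\lambda_1}\right)^L\right],$$
where $\lambda_1$ is the highest eigenvalue of $V$, while $\lambda_2$ is the other eigenvalue:
$$\lambda_1 = e^{\beta J} \cosh \beta h + \sqrt{e^{2\beta J} (\sinh \beta h)^2 + e^{-2\beta J}},$$
and $|\lambda_2| < \lambda_1$. This gives the free energy $f(\beta, h) = -\frac{1}{\beta} \ln \lambda_1$, which reduces to the zero-field formula when $h = 0$. In the thermodynamic limit for the non-interacting case ($J = 0$), we obtain $Z_N \to (2\cosh(\beta h))^N$, the same answer as for the open-boundary Ising model.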
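The transfer-matrix result can be checked against enumeration on a short ring. This sketch assumes the symmetric 2 × 2 transfer matrix with entries e^{β(J+h)}, e^{−βJ}, e^{−βJ}, e^{β(J−h)}, whose eigenvalues follow from its trace and determinant; the ring length and parameters are illustrative.

```python
import itertools
import math


def ring_partition(L, beta, J, h):
    """Brute-force Z for a periodic chain with H = -J sum s_k s_{k+1} - h sum s_k."""
    total = 0.0
    for s in itertools.product((-1, 1), repeat=L):
        e = -sum(J * s[k] * s[(k + 1) % L] + h * s[k] for k in range(L))
        total += math.exp(-beta * e)
    return total


def transfer_eigenvalues(beta, J, h):
    """Eigenvalues (largest first) of the symmetric 2x2 transfer matrix."""
    a = math.exp(beta * J) * math.cosh(beta * h)
    b = math.sqrt(math.exp(2 * beta * J) * math.sinh(beta * h) ** 2
                  + math.exp(-2 * beta * J))
    return a + b, a - b


L, beta, J, h = 6, 0.4, 1.0, 0.3
lam1, lam2 = transfer_eigenvalues(beta, J, h)
Z_transfer = lam1 ** L + lam2 ** L          # Z = Tr(V^L)
Z_brute = ring_partition(L, beta, J, h)
free_energy_density = -math.log(lam1) / beta  # thermodynamic-limit value
print(Z_transfer, Z_brute, free_energy_density)
```

For long chains the ratio (λ2/λ1)^L vanishes, so only the largest eigenvalue survives in the free energy per site.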
If we designate the number of sign changes in a configuration as $k$, the difference in energy from the lowest energy state is $2k$. Since the energy is additive in the number of flips, the probability $p$ of having a spin-flip at each position is independent. The ratio of the probability of finding a flip to the probability of not finding one is the Boltzmann factor:
$$\frac{p}{1-p} = e^{-2\beta}.$$
The problem is reduced to independent biased coin tosses. This essentially completes the mathematical description.
From the description in terms of independent tosses, the statistics of the model for long lines can be understood. The line splits into domains. Each domain is of average length exp(2β). The length of a domain is distributed exponentially, since there is a constant probability at any step of encountering a flip. The domains never become infinite, so a long system is never magnetized. Each step reduces the correlation between a spin and its neighbor by an amount proportional to p, so the correlations fall off exponentially.
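The partition function is the volume of configurations, each configuration weighted by its Boltzmann weight. Since each configuration is described by the sign-changes, the partition function factorizes. Measuring energies from the lowest-energy state, each of the $L$ positions contributes a weight of 1 (no sign change) or $e^{-2\beta}$ (sign change):
$$Z = \prod_k \left(1 + e^{-2\beta}\right) = \left(1 + e^{-2\beta}\right)^L.$$
The logarithm divided by $L$ gives the free energy density:
$$-\beta f = \frac{\log Z}{L} = \log\left(1 + e^{-2\beta}\right),$$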
which is analytic away from β = ∞. A sign of a phase transition is a non-analytic free energy, so the one-dimensional model does not have a phase transition.
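The transverse-field model, with Hamiltonian $H(\sigma) = -J \sum_i \sigma_i^z \sigma_{i+1}^z - h \sum_i \sigma_i^x$, experiences a phase transition between an ordered and disordered regime at J ~ h. This can be shown by a mapping of Pauli matrices
$$\sigma_n^z = \prod_{i=1}^{n} T_i^x, \qquad \sigma_n^x = T_n^z T_{n+1}^z.$$
Upon rewriting the Hamiltonian in terms of these change-of-basis matrices, we obtain
$$H(\sigma) = -h \sum_i T_i^z T_{i+1}^z - J \sum_i T_i^x.$$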
Since the roles of h and J are switched, the Hamiltonian undergoes a transition at J = h.
When is small, we have , so we can numerically evaluate by iterating the functional equation until is small.
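Onsager obtained the following analytical expression for the free energy of the Ising model on the anisotropic square lattice when the magnetic field $h = 0$, in the thermodynamic limit, as a function of temperature and the horizontal and vertical interaction energies $J_1$ and $J_2$, respectively:
$$-\beta f = \ln 2 + \frac{1}{8\pi^2} \int_0^{2\pi} d\theta_1 \int_0^{2\pi} d\theta_2 \, \ln\left[\cosh(2\beta J_1)\cosh(2\beta J_2) - \sinh(2\beta J_1)\cos\theta_1 - \sinh(2\beta J_2)\cos\theta_2\right].$$
From this expression for the free energy, all thermodynamic functions of the model can be calculated by using an appropriate derivative. The 2D Ising model was the first model to exhibit a continuous phase transition at a positive temperature. It occurs at the temperature $T_c$ which solves the equation
$$\sinh\left(\frac{2J_1}{kT_c}\right) \sinh\left(\frac{2J_2}{kT_c}\right) = 1.$$
In the isotropic case when the horizontal and vertical interaction energies are equal, $J_1 = J_2 = J$, the critical temperature $T_c$ occurs at the following point:
$$T_c = \frac{2J}{k \ln(1 + \sqrt{2})}.$$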
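The critical-temperature condition for the anisotropic square lattice has no closed-form solution in general, but it is easy to solve numerically. The sketch below uses bisection, assuming positive couplings, units in which the Boltzmann constant k is 1 by default, and a bracket derived from the isotropic value 2J/ln(1 + √2) ≈ 2.269 J; the function name and bracket constants are choices made for this example.

```python
import math


def critical_temperature(J1, J2, k=1.0, iterations=200):
    """Solve sinh(2 J1 / (k T)) * sinh(2 J2 / (k T)) = 1 for T by bisection (J1, J2 > 0)."""
    def f(T):
        return math.sinh(2 * J1 / (k * T)) * math.sinh(2 * J2 / (k * T)) - 1.0

    # f decreases with T. The root lies between the isotropic critical
    # temperatures for min(J1, J2) and max(J1, J2), i.e. about 2.269 * J / k.
    lo = 2.0 * min(J1, J2) / k   # f(lo) > 0
    hi = 2.3 * max(J1, J2) / k   # f(hi) < 0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


tc_isotropic = critical_temperature(1.0, 1.0)
tc_exact = 2.0 / math.log(1.0 + math.sqrt(2.0))
tc_anisotropic = critical_temperature(1.0, 0.5)
print(tc_isotropic, tc_exact, tc_anisotropic)
```

Weakening one of the two couplings lowers the critical temperature, as the anisotropic value in the example shows relative to the isotropic one.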
When the interaction energies $J_1$, $J_2$ are both negative, the Ising model becomes an antiferromagnet. Since the square lattice is bipartite, it is invariant under this change when the magnetic field $h = 0$, so the free energy and critical temperature are the same for the antiferromagnetic case. For the triangular lattice, which is not bipartite, the ferromagnetic and antiferromagnetic Ising models behave notably differently. Specifically, around a triangle, it is impossible to make all 3 spin-pairs antiparallel, so the antiferromagnetic Ising model cannot reach the minimal energy state. This is an example of geometric frustration.
Onsager's formula for the spontaneous magnetization below the critical temperature is
$$M = \left(1 - \left[\sinh(2\beta J_1)\sinh(2\beta J_2)\right]^{-2}\right)^{1/8},$$
where $J_1$ and $J_2$ are horizontal and vertical interaction energies.
A complete derivation was only given in 1951, using a limiting process of transfer matrix eigenvalues. The proof was subsequently greatly simplified in 1963 by Montroll, Potts, and Ward using Szegő's limit formula for Toeplitz determinants by treating the magnetization as the limit of correlation functions.
In three as in two dimensions, Peierls' argument shows that there is a phase transition. This phase transition is rigorously known to be continuous (in the sense that correlation length diverges and the magnetization goes to zero), and is called the critical point. It is believed that the critical point can be described by a renormalization group fixed point of the Wilson-Kadanoff renormalization group transformation. It is also believed that the phase transition can be described by a three-dimensional unitary conformal field theory, as evidenced by Monte Carlo simulations, exact diagonalization results in quantum models, and quantum field theoretical arguments. Although it is an open problem to establish rigorously the renormalization group picture or the conformal field theory picture, theoretical physicists have used these two methods to compute the critical exponents of the phase transition, which agree with the experiments and with the Monte Carlo simulations. This conformal field theory describing the three-dimensional Ising critical point is under active investigation using the method of the conformal bootstrap. This method currently yields the most precise information about the structure of the critical theory (see Ising critical exponents).
In 2000, Sorin Istrail of Sandia National Laboratories proved that the spin glass Ising model on a nonplanar lattice is NP-complete. That is, assuming P ≠ NP, the general spin glass Ising model is exactly solvable only in planar cases, so solutions for dimensions higher than two are also intractable. Istrail's result only concerns the spin glass model with spatially varying couplings, and tells nothing about Ising's original ferromagnetic model with equal couplings.